Add RerankRequestChunker #130485
Conversation
@elasticmachine merge upstream
...ference/src/main/java/org/elasticsearch/xpack/inference/action/TransportInferenceAction.java
...pack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContext.java
...pack/inference/rank/textsimilarity/TextSimilarityRankFeaturePhaseRankCoordinatorContext.java
...org/elasticsearch/xpack/inference/services/elasticsearch/ElasticRerankerServiceSettings.java
...inference/src/main/java/org/elasticsearch/xpack/inference/chunking/RerankRequestChunker.java
Pinging @elastic/ml-core (Team:ML)

Hi @dan-rubinstein, I've created a changelog YAML for you.
LGTM
```java
private RankedDocsResults parseRankedDocResultsForChunks(RankedDocsResults rankedDocsResults) {
    List<RankedDocsResults.RankedDoc> updatedRankedDocs = new ArrayList<>();
    Set<Integer> docIndicesSeen = new HashSet<>();
    for (RankedDocsResults.RankedDoc rankedDoc : rankedDocsResults.getRankedDocs()) {
```
To be safe, and to ensure the highest-scoring chunk is used, rankedDocsResults should be sorted. The results almost certainly will already be sorted, but just in case.
The sorting could be done in the RankedDocsResults constructor.
Right, good catch. I added the sort at the end of this function, but it really belongs before the loop over rankedDocsResults.getRankedDocs(), to ensure we take the top result for each doc (and ideally in the constructor, to cover cases where the results aren't sorted). I'll update this to sort the ranked docs before looping, and I'll also rename updatedRankedDocs to topRankedDocs, as I think that's a bit clearer about what we're trying to store.
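The approach the thread converges on (sort by score descending, then keep the first chunk seen per document) can be sketched roughly as follows. This is a minimal, hypothetical sketch: `RankedDoc` and the chunk-to-document index mapping below are simplified stand-ins, not the PR's actual types.

```java
import java.util.*;

// Simplified stand-in for RankedDocsResults.RankedDoc (hypothetical).
record RankedDoc(int index, float relevanceScore) {}

class TopChunkSelector {
    // For each document, keep only its highest-scoring chunk.
    // chunkToDocIndex maps a chunk's index to the index of its source document.
    static List<RankedDoc> topRankedDocs(List<RankedDoc> rankedDocs, Map<Integer, Integer> chunkToDocIndex) {
        // Sort descending by score so the first chunk seen per doc is its best one.
        List<RankedDoc> sorted = new ArrayList<>(rankedDocs);
        sorted.sort(Comparator.comparing(RankedDoc::relevanceScore).reversed());

        List<RankedDoc> topRankedDocs = new ArrayList<>();
        Set<Integer> docIndicesSeen = new HashSet<>();
        for (RankedDoc rankedDoc : sorted) {
            int docIndex = chunkToDocIndex.get(rankedDoc.index());
            // Set.add returns false if we already kept a chunk for this doc.
            if (docIndicesSeen.add(docIndex)) {
                topRankedDocs.add(new RankedDoc(docIndex, rankedDoc.relevanceScore()));
            }
        }
        return topRankedDocs;
    }
}
```

Sorting up front (rather than after the loop) is what guarantees the per-document maximum is the score that survives deduplication.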
Issue - #121567

This change adds the ability to use chunking for the elastic reranker as an alternative long-document handling strategy to the existing truncation method. To enable chunking you must include `long_document_strategy` (with the value set to `chunk`) in the `service_settings` of the rerank inference endpoint being used to perform inference. The value can also be set manually to `truncate` to force truncation, but this is currently the default behavior. The `max_chunks_per_doc` value can optionally be included to limit the number of chunks that are sent for inference per document. If this value is not set, all chunks generated for the document will be sent.

When using chunking, documents will be chunked before inference and the chunks (either all or some, depending on whether `max_chunks_per_doc` is set) will be sent for inference. For each document, the relevance score returned to the user will be the maximum score of any chunk within the document.

Testing

- `truncate` selected for long document strategy: ensured that truncation worked as expected.
- `chunk` selected for long document strategy: ensured that documents were chunked and all chunks were sent for inference.
- `chunk` selected for long document strategy with `max_chunks_per_doc` set: ensured that a subset of chunks was sent for inference.
- `elasticsearch` service endpoint: ensured that inference is still working.
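The `max_chunks_per_doc` behavior described above, where only the first N chunks of a document are sent for inference, can be sketched as follows. This is an illustrative, hypothetical helper, not the PR's actual implementation.

```java
import java.util.List;

class ChunkLimiter {
    // Returns the chunks of one document that would be sent for inference.
    // A null maxChunksPerDoc means the setting is absent: send all chunks.
    static List<String> chunksToSend(List<String> docChunks, Integer maxChunksPerDoc) {
        if (maxChunksPerDoc == null || docChunks.size() <= maxChunksPerDoc) {
            return docChunks;
        }
        // Keep only the first maxChunksPerDoc chunks.
        return docChunks.subList(0, maxChunksPerDoc);
    }
}
```

Since unsent chunks never receive a score, the per-document maximum described above is taken only over the chunks that were actually sent.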